We examine several recently suggested methods for the detection of long-range
correlations in data series based on similar ideas as the well-established
Detrended Fluctuation Analysis (DFA). In particular, we present a detailed
comparison between the regular DFA and two recently suggested methods: the
Centered Moving Average (CMA) Method and a Modified Detrended Fluctuation
Analysis (MDFA). We find that CMA is performing equivalently as DFA in long
data with weak trends [...] standard DFA to MDFA we observe that DFA performs
slightly better in almost all examples we studied. We also discuss how several
types of trends affect the different types of DFA. For weak trends in the data,
the new methods are comparable with DFA in these respects. However, if the
functional form of the trend in data is not a-priori known, DFA remains the
method of choice. Only a comparison of DFA [...]

1 Introduction



Experimental data are often affected by non-stationarities, i.e. varying mean 
and standard deviation. These effects have to be well distinguished from the 
intrinsic fluctuations and correlations of the system in order to find the correct 
scaling behaviour. Sometimes we do not know the reasons for underlying non- 
stationarities in collected data and - even worse - we do not know the type 
of the underlying non-stationarities. 

In the last decade Detrended Fluctuation Analysis (DFA), originally introduced
by Peng et al. [1], has been established as an important method to reliably
detect long-range (auto-)correlations (see footnote 1) in data affected by trends.
The method is based on random walk theory. Its non-detrending predecessors 
are Hurst's rescaled range analysis [2] and fluctuation analysis (FA) [3]. DFA 
was later generalized for higher order detrending [4], multifractal analysis [5], 
separate analysis of sign and magnitude series [6] , and data with more than one 
dimension [7]. Its features have been studied in many articles [8,9,10,11,12,13]. 
In addition, several comparisons of DFA with other methods for stationary and 
non-stationary time-series analysis have been published, see, e.g., [14,15,16,17] 
and in particular [18], where DFA is compared with many other established 
methods for short data sets. 

The convenience of DFA has led to a broad range of application in very diverse 
fields where long-range correlations are of interest: 

• DNA sequences, 

• medical and physiological time series (recordings of heartbeat, breathing, 
blood pressure, blood flow, nerve spike intervals, human gait, glucose levels, 
gene expression data), 

• geophysics time series (recordings of temperature, precipitation, water runoff,
ozone levels, wind speed, seismic events, vegetational patterns, and climate
dynamics),

• astrophysical time series (X-ray light sources and sunspot numbers), 

• technical time series (internet traffic, highway traffic, and neutronic power 
from a reactor), 

• social time series (finance and economy, language characteristics, fatalities 
in the Iraq conflict), as well as 

• physics data, e.g., surface roughness, chaotic spectra of atoms, and photon 
correlation spectroscopy recordings. 

Altogether, there are about 450 papers applying DFA. In most cases positive
auto-correlations were reported, leaving only a few exceptions with
anti-correlations, see, e.g., [19,20,21].

Footnote 1: In the following we will label long-range as long-term when speaking,
more specifically, about time series.

In the DFA technique - as in all techniques based on random walk theory - 
time series are integrated by partial summation which enables also the anal- 
ysis of data with weak correlations. In addition the integration reduces the 
noise level caused by imperfect measurements and noise, an advantage that 
applies also to other related non-detrending methods [2,3]. However, for the 
reliable characterization of time series, it is also essential to distinguish trends
from intrinsic fluctuations that might be long-term correlated. Monotonic,
periodic or step-like trends are caused by external effects, e.g., by
greenhouse warming [22], seasonal variations for temperature records [23] and river
runoffs [2,24,25,26], different levels of daily activity in long-term physiological
data [27], or unstable light sources in photon correlation spectroscopy [28]. To
characterize a complex system based on time series, trends and fluctuations 
are usually studied separately (see, e.g., [29] for a recent discussion). Strong 
trends in data can lead to a false detection of long-term correlations if only 
one (non-detrending) method is used or if the results are not carefully inter- 
preted. A major advantage of the DFA technique is the systematic elimination 
of polynomial trends of different order [4,8,9]. Note however that an additive 
composition of fluctuations and trends is assumed. The technique can thus 
assist in gaining insight into the scaling behaviour of the natural variability 
as well as into the kind of trends of the considered time series [30] . 

Still, we would like to note that conclusions should not be based on DFA 
or variants of this method alone in most applications. In particular, if it is 
not clear whether a given time series is indeed long-term correlated or just 
short-term correlated with a fairly large correlation time scale, results of DFA 
should be compared with other methods. For example, one can employ wavelet 
methods (see, e.g., [23,25,31,32,33]). Another option is to remove short-term 
correlations by considering averaged series for comparison. For a time series 
with daily observations and possible short-term correlations up to two years, 
for example, one might consider the series of two-year averages and apply 
DFA as well as FA, Hurst's Analysis, binned power spectra analysis, and/or 
wavelet analysis. Only if at least two independent methods consistently indicate
long-term correlations can one be confident that the data are indeed long-term
correlated.

Lately, several modifications of the DFA method have been suggested with
many different techniques for the elimination of monotonic and periodic
trends. These methods include

• the Detrended Moving Average technique [34,35,36], which we denote by 
Backward Moving Average (BMA) technique (following [37]), 

• the Centered Moving Average (CMA) method [37], an essentially improved 
version of BMA, 



• the Modified Detrended Fluctuation Analysis (MDFA) [38], which is essen- 
tially a mixture of old FA and DFA, 

• the continuous DFA (CDFA) technique [39,40], which is particularly in- 
tended for the detection of transitions, 

• the Fourier DFA [41], 

• a variant of DFA based on empirical mode decomposition (EMD) [42], 

• a variant of DFA based on singular value decomposition (SVD) [43,44], and 

• a variant of DFA based on high-pass filtering [45]. 

Although several of the original publications compare their newly suggested
method with the DFA, there is no inter-comparison between these new methods.
Hence, it is not clear which methods might be most suitable for which
application. In this work we comment on all recently suggested detrending 
random walk based methods we are aware of. Moreover, we study and com- 
pare in detail two of the most prominent and - according to our studies - 
most suitable new methods with standard DFA, presenting their advantages 
and disadvantages. For recent comparative studies not focused on detrending 
methods, see [14,17,18]. For studies comparing DFA and BMA, see [46,47]; 
note that [47] also discusses CMA. For studies comparing methods for detrend- 
ing multifractal analysis (multifractal DFA (MF-DFA) and wavelet transform 
modulus maxima (WTMM) method), see [5,24,48]. 

The paper is organized as follows: In Section 2 we thoroughly explain the 
standard DFA as well as the Centered Moving Average (CMA) method, and a 
Modified Detrended Fluctuation Analysis (MDFA). We further introduce sev- 
eral other (more complicated) detrending methods and remark on their utility. 
Section 3 reports and discusses our results for DFA, CMA and MDFA, obtained
from monofractal artificial time series with different lengths, crossovers
and monotonic trends. We conclude in Section 4.



2 Methods 



2.1 Long-Range Correlations 



We consider a record $(x_i)$ of $i = 1, \ldots, N$ equidistant measurements. In most
applications, the index $i$ will correspond to the time of the measurements.
We are interested in the correlation of the values $x_i$ and $x_{i+s}$ for different
time lags, i.e. correlations over different time scales $s$. In order to remove a
constant offset in the data, the mean $\langle x \rangle = \frac{1}{N} \sum_{i=1}^{N} x_i$ is usually subtracted,
$\tilde{x}_i \equiv x_i - \langle x \rangle$. Quantitatively, correlations between $x$-values separated by $s$



steps are defined by the (auto-)correlation function $C(s)$.

If the $x_i$ are uncorrelated, $C(s)$ is zero for $s > 0$. Short-range correlations
of the $x_i$ are described by $C(s)$ declining exponentially, $C(s) \sim \exp(-s/t_\times)$,
with a decay time $t_\times$. For so-called long-range correlations, $t_\times = \int_0^\infty C(s)\,ds$
diverges and the decay time $t_\times$ cannot be defined. For example, $C(s)$ declines
as a power-law

$C(s) \sim s^{-\gamma}$    (2)

with an exponent $0 < \gamma < 1$. A direct calculation of $C(s)$ is usually not
appropriate due to underlying non-stationarities and trends of unknown origin.
Furthermore, $C(s)$ strongly fluctuates around zero on large scales $s$, making
it impossible to find the potential scaling behaviour (2). Thus, one has to
determine the correlation exponent $\gamma$ indirectly.
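For orientation, the direct estimate that the text advises against for non-stationary data can be sketched in a few lines of Python. This is only a minimal illustration: the function name `autocorrelation` and the normalisation by the sample variance (so that the value at lag 0 equals 1) are our assumptions, not a prescription from the references.

```python
def autocorrelation(x, s):
    """Sample auto-correlation of the record x at lag s.

    Assumed normalisation: division by the sample variance, so C(0) = 1.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]                # mean-subtracted record
    var = sum(d * d for d in dev) / n          # sample variance
    # average product of values that are s steps apart
    cov = sum(dev[i] * dev[i + s] for i in range(n - s)) / (n - s)
    return cov / var
```

For large lags only a few products enter the average, which is one way to see why such estimates fluctuate strongly around zero on large scales.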

Note that in some applications a separate inspection of short-term and long-term
correlations is desirable. A convenient way to exclude short-term correlations
up to a scale $s_l$ is downsampling the original data by the same factor
$s_l$. Conversely, the segmentation of the data into boxes of length $s_u$ and
a subsequent shuffling of the boxes destroys long-term correlations on scales
above $s_u$.
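Both operations are simple to sketch. In the following Python example the helper names `downsample` and `shuffle_boxes` are hypothetical, and we assume that downsampling is done by averaging consecutive blocks (plain decimation would be another reading of the text).

```python
import random


def downsample(x, factor):
    """Average consecutive blocks of `factor` points (an incomplete tail is dropped)."""
    n_blocks = len(x) // factor
    return [sum(x[k * factor:(k + 1) * factor]) / factor for k in range(n_blocks)]


def shuffle_boxes(x, box, rng=None):
    """Cut x into boxes of length `box` and return the boxes in random order.

    The order inside each box is kept, so correlations on scales below
    `box` survive while those above are destroyed.
    """
    rng = rng or random.Random()
    boxes = [x[k:k + box] for k in range(0, len(x), box)]
    rng.shuffle(boxes)
    return [v for b in boxes for v in b]
```

Passing a seeded `random.Random` instance as `rng` makes the surrogate series reproducible.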



2.2 Detrended Fluctuation Analysis (DFA) 



The method of Detrended Fluctuation Analysis (DFA) [1] is an improvement
of classical fluctuation analysis (FA) [3], which is similar to Hurst's rescaled
range (R/S) analysis [2]. They allow determining the correlation properties
on large time scales. All three methods are based on random walk theory. One
first calculates the 'profile'

$X(n) = \sum_{i=1}^{n} (x_i - \langle x \rangle)$    (3)

of a time series $(x_i)$, $i = 1, \ldots, N$ (with mean $\langle x \rangle$), which can be considered
as the position of a random walker on a linear chain after $n$ steps.

Then the profile is divided into $N_s = \lfloor N/s \rfloor$ non-overlapping segments of equal
length ('scale') $s$. The mean-squared fluctuation function of the FA method is
given by

$F^2(s) = \frac{1}{N_s} \sum_{\nu=1}^{N_s} [X((\nu-1)s) - X(\nu s)]^2$.    (4)



In Hurst's R/S analysis (see [17] and references therein for a recently suggested
improved version and tests), one calculates in each segment $\nu$ the
range $R$ of $X(n)$ given by the difference between maximal and minimal value,
$R(s) = X_{\max} - X_{\min}$. The "rescaling of range" is done by dividing $R(s)$ by the
corresponding standard deviation $S(s) = \sigma(X(n))$ of the same segment $\nu$. The
mean of all quotients at a particular scale $s$ is equivalent to $F(s)$ (except for
multi-fractal data) and usually shows a power-law scaling relationship with $s$.
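The FA fluctuation function of Eq. (4) translates almost literally into code. The sketch below is a plain-Python illustration with hypothetical function names; it assumes the convention $X(0) = 0$ for the empty sum, which Eq. (4) needs for the first segment.

```python
def profile(x):
    """Cumulative sum of the mean-subtracted record, Eq. (3)."""
    mean = sum(x) / len(x)
    out, total = [], 0.0
    for v in x:
        total += v - mean
        out.append(total)
    return out


def fa_fluctuation(x, s):
    """Non-detrending FA fluctuation function F(s), the square root of Eq. (4)."""
    X = [0.0] + profile(x)          # X[n] is the profile after n steps, X[0] = 0
    n_seg = len(x) // s             # number of non-overlapping segments N_s
    sq = [(X[(nu - 1) * s] - X[nu * s]) ** 2 for nu in range(1, n_seg + 1)]
    return (sum(sq) / n_seg) ** 0.5
```

Only the profile values at the segment boundaries enter, so any trend in the data contributes directly to $F(s)$.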

While both FA and Hurst's method fail to determine correlation properties
if linear or higher order trends are present in the data, DFA explicitly deals
with monotonic trends in a detrending procedure. This is done by estimating
a piecewise polynomial trend $y_s^{(p)}(n)$ within each segment $\nu$ by least-squares
fitting, i.e., $y_s^{(p)}(n)$ consists of concatenated polynomials of order $p$ which are
calculated separately for each of the segments. The detrended profile function
$X_s(n)$ on scale $s$ is determined by ('detrending'):

$X_s(n) = X(n) - y_s^{(p)}(n)$.    (5)

The degree of the polynomial can be varied in order to eliminate linear ($p = 1$),
quadratic ($p = 2$) or higher order trends of the profile function [4]. Conventionally
the DFA is named after the order of the fitting polynomial (DFA1, DFA2,
...). Note that DFA1 is equivalent to Hurst's analysis in terms of detrending.

The variance of $X_s(n)$ yields the fluctuation function on scale $s$

$F(s) = \left[ \frac{1}{N} \sum_{n=1}^{N} X_s^2(n) \right]^{1/2}$.    (6)



This function, which has to be calculated for different scales $s$, corresponds
to the trend-eliminated root mean square displacement of the random walker
mentioned above and is related to the auto-correlation function by an integral
expression [1]; see the appendix of [14] for a derivation for DFA1. For an
equivalent, but more common description of DFA, see, e.g., [8]. We note that
in studies that include averaging over many records (or one record cut into
many separate pieces by the elimination of some unreliable intermediate data
points) the averaging procedure (6) must be performed for all data. Taking
the square root should usually be the final step after all averaging is finished;
however note [16,17], where this order is reversed. It is usually not appropriate
to calculate $F(s)$ for parts of the data and then average the $F(s)$ values, since
such a procedure will bias the results towards smaller scaling exponents on
large time scales close to the maximum scale $s_{\max} \sim N/4$ where statistically
reliable results can be obtained [8].
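The procedure of Eqs. (3), (5) and (6) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the polynomial fit is done by solving the normal equations on an abscissa rescaled to $[-1, 1]$ (adequate for low orders $p$ only), segments are taken from the start of the record, and the average is normalised by the number of points actually covered by complete segments.

```python
def polyfit_residuals(y, order):
    """Residuals of a least-squares polynomial fit of the given order to y."""
    m = len(y)
    t = [2.0 * k / (m - 1) - 1.0 for k in range(m)]   # abscissa rescaled to [-1, 1]
    p = order + 1
    # normal equations A c = b for the polynomial coefficients c
    A = [[sum(tk ** (i + j) for tk in t) for j in range(p)] for i in range(p)]
    b = [sum((tk ** i) * yk for tk, yk in zip(t, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # back substitution
    coef = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = (b[i] - tail) / A[i][i]
    return [yk - sum(c * tk ** i for i, c in enumerate(coef))
            for tk, yk in zip(t, y)]


def dfa(x, s, order=1):
    """DFA fluctuation function F(s) for detrending order p = `order`."""
    mean = sum(x) / len(x)
    X, total = [], 0.0
    for v in x:                       # profile, Eq. (3)
        total += v - mean
        X.append(total)
    n_seg = len(X) // s
    sq_sum = 0.0
    for nu in range(n_seg):           # detrend each segment separately, Eq. (5)
        res = polyfit_residuals(X[nu * s:(nu + 1) * s], order)
        sq_sum += sum(r * r for r in res)
    return (sq_sum / (n_seg * s)) ** 0.5   # Eq. (6), over the covered points
```

When several records are combined, the squared residuals would be accumulated over all records before the final square root is taken, in line with the remark above.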

If $F(s)$ increases for increasing $s$ asymptotically as

$F(s) \sim s^{\alpha}$    (7)

with $0.5 < \alpha < 1$, one finds that the scaling (or 'Hurst') exponent $\alpha$ is related
to the correlation exponent $\gamma$ by $\alpha = 1 - \gamma/2$ [14]. A value of $\alpha = 0.5$ thus
indicates that there are no (or only short-term) correlations. If $\alpha > 0.5$, the
data are long-term correlated. The higher $\alpha$, the stronger are the correlations
in the signal. Note that $\alpha > 1$ indicates a non-stationary local average of the
data; in this case both FA and Hurst analysis fail and yield only $\alpha = 1$. The
case $\alpha < 0.5$ corresponds to long-term anticorrelations, meaning that large
values are most likely to be followed by small values and vice versa [19,20,21].
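In practice $\alpha$ is read off as the slope of $\log F(s)$ versus $\log s$. A minimal sketch of such a fit is given below; the function names are hypothetical, an ordinary least-squares line is assumed, and the conversion to $\gamma$ simply inverts $\alpha = 1 - \gamma/2$, which is meaningful only for $0.5 < \alpha < 1$.

```python
import math


def scaling_exponent(scales, fluct):
    """Least-squares slope of log F(s) versus log s."""
    lx = [math.log(s) for s in scales]
    ly = [math.log(f) for f in fluct]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx


def correlation_exponent(alpha):
    """gamma = 2 (1 - alpha), from alpha = 1 - gamma / 2."""
    return 2.0 * (1.0 - alpha)
```

The choice of the scales passed to the fit (for example a fixed lower limit) is left to the caller.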

If the type of trends in given data is not known beforehand, the fluctuation
function $F(s)$ should be calculated for several orders $p$ of the fitting polynomial.
If $p$ is too low, $F(s)$ will show a pronounced crossover to a regime with
larger slope for large scales $s$ [8,9]. The maximum slope of $\log F(s)$ versus
$\log s$ is $p + 1$. The crossover will move to larger scales $s$ or disappear when
$p$ is increased, unless it is a real crossover in the intrinsic fluctuations and
not due to trends [8]. Hence, one can find $p$ such that detrending is sufficient.
However, $p$ should not be larger than necessary, because deviations on short
scales $s$ increase with increasing $p$.



2.3 Centered Moving Average (CMA) Analysis



A possible drawback of the DFA method is the occurrence of abrupt jumps
in the detrended profile $X_s(n)$ (Eq. (5)) at the boundaries between the segments,
since the fitting polynomials in neighbouring segments are not related.
A simple way to avoid these jumps would be the calculation of $F(s)$ based on
polynomial fits in overlapping windows. However, this is rather time consuming
due to the polynomial fit in each segment and is consequently not done
in most applications. To overcome the problem of artificial jumps and to reliably
determine the scaling exponent $\alpha$ in non-stationary time series, several
modifications of the FA and DFA methods were suggested in recent years.

A particularly attractive modification leads to the methods of Detrended Moving
Average (DMA), where running averages replace the polynomial fits. Its first
suggested version, the Backward Moving Average (BMA) method [34,35,36],
however, slightly underestimates the scaling exponent by about 0.05, because
an artificial time shift of $s$ between the original signal and the moving average
is introduced. This time shift leads to an additional contribution to $X_s(n)$,
which causes a larger fluctuation function $F(s)$ in particular for small scales
in the case of long-term correlated data [46]. In addition, the BMA method is
effectively not detrending [47]. Its slope $\alpha$ is limited by 1 just as for the earlier
non-detrending methods FA and R/S.

It was soon recognized that the artificial time shift of the BMA method
can easily be eliminated. This leads to the Centered Moving Average (CMA)
method [37], where $X_s(n)$ is calculated as

$X_s(n) = X(n) - \frac{1}{s} \sum_{j=-(s-1)/2}^{(s-1)/2} X(n+j)$,    (8)

while Eq. (6) stays the same. Unlike DFA, the CMA method cannot easily
be generalized to remove linear and higher order trends in the data. However,
CMA is somewhat similar to DFA1 with overlapping windows.
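A compact sketch of the centered moving average detrending is given below in Python. The function name `cma` is hypothetical, and two choices are our assumptions: the window length $s$ is required to be odd so that the window is symmetric, and points closer than $(s-1)/2$ to either end of the record are skipped, since the full window is not available there.

```python
def cma(x, s):
    """CMA fluctuation function F(s) for an odd window length s."""
    if s % 2 == 0:
        raise ValueError("s must be odd for a centered window")
    mean = sum(x) / len(x)
    X, total = [], 0.0
    for v in x:                        # profile, Eq. (3)
        total += v - mean
        X.append(total)
    h = (s - 1) // 2
    cum = [0.0]                        # prefix sums give each window sum in O(1)
    for v in X:
        cum.append(cum[-1] + v)
    sq_sum, count = 0.0, 0
    for n in range(h, len(X) - h):     # only points with a complete window
        window_mean = (cum[n + h + 1] - cum[n - h]) / s
        sq_sum += (X[n] - window_mean) ** 2
        count += 1
    return (sq_sum / count) ** 0.5
```

Because no fit is involved, the cost per scale grows only linearly with the record length.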



2.4 Modified Detrended Fluctuation Analysis (MDFA) 



Another type of detrended fluctuation analysis, which we will denote as Modified
Detrended Fluctuation Analysis (MDFA) [38], eliminates trends similarly
to the DFA method. A polynomial is fitted to the profile function $X(n)$ in
each segment $\nu$ and the deviation between the profile function and the polynomial
fit is calculated, $X_s(n) = X(n) - y_s^{(p)}(n)$. To estimate correlations in
the data, this method uses a derivative of $X_s(n)$, obtained for each segment
$\nu$, by $\Delta X_s(n) = X_s(n + s/2) - X_s(n)$. Hence, Eq. (6) becomes

$F(s) = \left[ \frac{1}{N} \sum_{n=1}^{N} (\Delta X_s(n))^2 \right]^{1/2}$.    (9)

As in the case of DFA, MDFA can easily be generalized to remove higher order
trends in the data. Since the fitting polynomials in adjacent segments are not
related, $\Delta X_s(n)$ shows abrupt jumps on their boundaries as well. This leads
to fluctuations of $F(s)$ for large segment sizes $s$ as we will show below.
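The following Python sketch illustrates the MDFA idea for linear detrending only. It is our reading of the description above, not the implementation of [38]: the names are hypothetical, $s$ is taken to be even, and the difference $\Delta X_s(n)$ is formed only for the first half of each segment so that $n + s/2$ stays inside the same segment.

```python
def linear_residuals(y):
    """Residuals of a least-squares straight-line fit to y."""
    m = len(y)
    t_mean = (m - 1) / 2.0
    y_mean = sum(y) / m
    stt = sum((k - t_mean) ** 2 for k in range(m))
    sty = sum((k - t_mean) * (yk - y_mean) for k, yk in enumerate(y))
    slope = sty / stt
    return [yk - (y_mean + slope * (k - t_mean)) for k, yk in enumerate(y)]


def mdfa1(x, s):
    """MDFA fluctuation function with linear detrending, for even s."""
    if s % 2:
        raise ValueError("s must be even")
    mean = sum(x) / len(x)
    X, total = [], 0.0
    for v in x:                               # profile, Eq. (3)
        total += v - mean
        X.append(total)
    half = s // 2
    sq_sum, count = 0.0, 0
    for nu in range(len(X) // s):
        res = linear_residuals(X[nu * s:(nu + 1) * s])
        for n in range(half):                 # difference over half a segment
            d = res[n + half] - res[n]
            sq_sum += d * d
            count += 1
    return (sq_sum / count) ** 0.5
```

Higher detrending orders would replace `linear_residuals` by a polynomial fit of the desired order.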



2.5 Further Modifications and Extensions of DFA



Several modifications and extensions of DFA have been proposed. Most of
them are, however, rather complicated in implementation. While they might
be very useful in particular applications, we believe the implications of the
complicated detrending and decomposition techniques are not sufficiently
understood and their programming effort is too large for a wide usage.

The Fourier-detrended fluctuation analysis [41] aims to eliminate slow oscillatory
trends which are found especially in weather and climate series due to
seasonal influences. The character of these trends can be rather periodic and
regular or irregular, and their influence on the detection of long-range correlations
by means of DFA was systematically studied previously [8]. Among
other things it has been shown that low-frequency periodic trends disturb the
scaling behaviour of the results much more strongly than high-frequency trends and
thus have to be removed prior to the analysis. In the case of periodic and regular
oscillations, e.g., in temperature fluctuations, one simply removes the low-frequency
seasonal trend by subtracting the daily mean temperatures from the
data. Another way, which the Fourier-detrended fluctuation analysis suggests,
is to filter out the relevant frequencies in the signal's Fourier spectrum before
applying DFA to the filtered signal. Nevertheless, this method, which is only an
extension of DFA, faces several difficulties, especially its limitation to periodic
and regular trends. Furthermore, one needs to know the interfering frequency
band beforehand.
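The simple seasonal detrending mentioned above can be sketched as follows. We read "daily mean temperatures" as the long-term mean for each position within the seasonal cycle (for daily data, each calendar day); the function name and the fixed integer `period` are assumptions of this sketch, and leap days are ignored.

```python
def remove_seasonal_cycle(x, period):
    """Subtract from each value the mean over all values at the same phase.

    Assumes len(x) >= period, so that every phase occurs at least once.
    """
    sums = [0.0] * period
    counts = [0] * period
    for i, v in enumerate(x):
        sums[i % period] += v
        counts[i % period] += 1
    cycle = [sums[k] / counts[k] for k in range(period)]   # mean seasonal cycle
    return [v - cycle[i % period] for i, v in enumerate(x)]
```

The resulting anomaly series, not the raw record, would then be passed to the fluctuation analysis.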

To study correlations in data with quasi-periodic or irregular oscillating trends,
empirical mode decomposition (EMD) was suggested [42]. The EMD algorithm
breaks down the signal into its intrinsic mode functions (IMFs) which can be
used to distinguish between fluctuations and trends. The trend, estimated
by a quasi-periodic fit containing the dominating frequencies of a sufficiently
large number of IMFs, is subtracted from the data, yielding a slightly better
scaling behaviour in the DFA curves. However, we believe that this extension
of DFA is too complicated for widespread applications.

Another extension of DFA which was shown to minimize the effect of periodic
and quasi-periodic trends is based on singular value decomposition (SVD)
[43,44]. In this approach, one first embeds the original signal in a matrix whose
dimension has to be much larger than the number of frequency components
of the periodic or quasi-periodic trends obtained in the power spectrum. Applying
SVD yields a diagonal matrix which can be manipulated by setting
the dominant eigenvalues (associated with the trends) to zero. The filtered
matrix finally leads to the filtered data, and it has been shown that subsequent
application of DFA determines the expected scaling behaviour if the
embedding dimension is sufficiently large. Nonetheless, the performance of
this rather complex method seems to decrease for larger values of the scaling
exponent. Furthermore, SVD-DFA assumes that trends are deterministic and
narrow banded.

Nevertheless, the above-mentioned extensions of DFA show the need for a
fluctuation analysis that can also handle oscillatory data automatically. The
detrending procedure in DFA (Eq. (5)) can be regarded as a scale-dependent
high-pass filter since (low-frequency) fluctuations exceeding a specific scale
$s$ are eliminated. Therefore, it has been suggested to obtain the detrended
profile $X_s(i)$ for each scale $s$ directly by applying digital high-pass filters [45].
In particular, Butterworth, Chebyshev-I, Chebyshev-II, and an elliptical filter
were suggested. While the elliptical filter showed the best performance in
detecting long-range correlations in artificial data, the Chebyshev-II filter was
found to be problematic. Additionally, in order to avoid a time shift between
filtered and original profile, the average of the directly filtered signal and the
time-reversed filtered signal is considered. The effects of these complicated
filters on the scaling behaviour are, however, not fully understood.

Finally, a continuous DFA method has been suggested in the context of studying
heartbeat data during sleep [39,40]. The method compares unnormalized
fluctuation functions $F(s)$ for increasing length of the data, i.e., one starts
with a very short recording and subsequently adds more points of data. The
method is particularly suitable for the detection of change points in the data,
e.g., physiological transitions between different activity or sleep stages. Since
the main objective of the method is not the study of scaling behaviour, we do
not discuss it in detail in this comparison.



3 Results 



3.1 Estimating the scaling behaviour in long and short data sets



In the first part of our comparison between DFA, CMA and MDFA, we calculate
the scaling exponent $\alpha$ for long-range correlated normally distributed
data sets of length $N = 50000$. The data sets are generated using the modified
Fourier filtering method, see, e.g., [49]. As one can see in Fig. 1(a), all three
methods give sufficiently good results for different values of $\alpha$. However, on
closer examination, i.e., looking at the successive slopes (logarithmic point
to point derivatives) of $F(s)$ (Figs. 1(b)-(d)), it can be seen that DFA1 and
MDFA1 systematically overestimate the scaling exponent for small scales $s$.
This effect has already been discussed for DFA and a modification was suggested
for removing this artifact [8,50]. In addition, the significant fluctuations
of the successive slopes of DFA1 (and MDFA1) on large scales $s$ led to the




Fig. 1. (a) Fluctuation functions $F(s)$ versus scale $s$ of long-range correlated data
with different scaling exponents $\alpha = 0.3, 0.7, 1.2$, by means of DFA1 (circles), CMA
(triangles) and MDFA1 (squares). The results have been obtained by averaging
$F^2(s)$ over 100 artificial series of length $N = 50000$ for each method and scaling
exponent. The DFA1 curve is shifted downwards for clarity. (b)-(d) Point to point
derivative of the average fluctuation functions shown in Fig. 1(a) for (b) $\alpha = 0.3$,
(c) $\alpha = 0.7$ and (d) $\alpha = 1.2$. Note the deviation from the scaling behaviour for small
and large scales for DFA1 and MDFA1.

rule of determining $\alpha$ only up to a scale of $N/4$ [8]. Nevertheless, Figs. 1(b)-(d)
show that the scaling behaviour of CMA is more stable than for DFA1 and
MDFA1, suggesting that CMA could be used for reliable computation of $\alpha$
even for scales $s < 10$ and up to $s_{\max} = N/2$.
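The successive slopes used in Figs. 1(b)-(d) are simply the discrete logarithmic derivative of $F(s)$ between neighbouring scales. A minimal sketch, with a hypothetical function name and no smoothing applied, could look like this:

```python
import math


def local_slopes(scales, fluct):
    """Point-to-point derivative of log F(s) with respect to log s."""
    return [
        (math.log(fluct[k + 1]) - math.log(fluct[k]))
        / (math.log(scales[k + 1]) - math.log(scales[k]))
        for k in range(len(scales) - 1)
    ]
```

For a pure power law all local slopes coincide with $\alpha$; systematic deviations at small or large scales then show up directly.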

An important topic in fluctuation analysis is the influence of the signal length
upon the reliability of the estimated scaling behaviour. For this purpose, we
applied DFA1, CMA and MDFA1 to long-range correlated data and calculated
the mean and standard deviation of the scaling exponents ($\bar{\alpha}$, $\sigma(\alpha)$) as a
function of the signal length $N$ (Figs. 2(a)-(c)). There are two ways of defining
the scaling range for the fitting procedure of $\alpha$. Firstly, one can fix the lower
limit to $s = 10$ (in order to reduce the influence of the small scales, where $\alpha$ is
overestimated by DFA1 and MDFA1, see Figs. 1(b)-(d)). The upper limit in
this "fixed lower limit" range is set to $N/2$ here. Figure 2 shows the result for
this first definition. As can be clearly seen, the exponents become more accurate
if a larger scaling range is used in the fitting procedure. While CMA and
DFA1 show similar results and systematically underestimate the real scaling
exponent for very short data ($N < 100$) (see footnote 2), $\bar{\alpha}_{\rm MDFA}$ is quite stable. However,



Footnote 2: This outcome is rather surprising, since from Fig. 1 one would expect that DFA1
and MDFA1 overestimate $\alpha$ for short data (due to the deviations on small scales).
However, it turns out that one has to take into account the different averaging
procedures. In Fig. 2 we simply average over all calculated exponents for given data
lengths, which have a large standard deviation for short series. On the other hand,
in Fig. 1, the fluctuation functions are averaged non-logarithmically. Configurations
with large values thus affect the means much more than configurations with small
values. This averaging procedure favours larger slopes, since the variations of $F(s)$

Fig. 2. (colour online) Mean and standard deviation of the calculated scaling exponents
for different signal lengths $N$ and histograms of scaling exponents. The fitting
range was fixed at a lower limit, namely $s \in [10, N/2]$. We applied (a) DFA1, (b)
CMA and (c) MDFA1 to 1000 generated series with $\alpha = 0.7$. In (d)-(f), the
corresponding normalized histograms of the scaling exponents are shown for $N = 50$
(red colour) and $N = 5000$ (black colour). Note that the distribution of $\alpha_{\rm DFA}$ is
asymmetric for $N = 50$.

$\sigma(\alpha_{\rm MDFA})$ is significantly increased for $N < 100$ (see also Fig. 3).

An alternative definition of $\alpha$ is based on a moving fitting regime with "fixed
width", e.g., from $N/20$ to $N/2$. In this case, $\sigma(\alpha)$ is practically independent
of $N$ (not shown). Since both definitions are identical for $N = 200$, the results
in Fig. 2 for $N = 200$ are valid also for larger $N$ in case of the fixed width
definition. In the following, we will only refer to a fitting range with fixed
lower limit.

An interesting question when studying the behaviour of $\sigma(\alpha)$ versus $N$ is
whether the variations of $\alpha$ are due to fluctuating properties of the data or
due to the inaccuracy of the methods. It is hard to clarify this, since there is no
way to identify or define a scaling exponent for any data without applying an
analyzing method. Here we use DFA as such a reference method. In the Fourier
Filtering Method (see above) the data is generated by manipulating the slope
in the power spectrum, i.e. $\beta$, which is directly related to the exponent $\alpha$
(see Section 3.2 and [51]). Nevertheless, the shorter a time series, the less



are generally larger for larger $s$. Furthermore, the larger slopes occur for $s < 10$ in
Fig. 1(b), while the fitting range is set to $10 < s < 25$ for the first point in Fig. 2(a).

Fig. 3. Standard deviation of $\alpha$ versus signal length $N$ for DFA1 (circles), CMA
(triangles) and MDFA1 (squares). This figure can be used as a calibration curve,
i.e. to estimate the uncertainty, $\sigma$, of $\alpha$ depending on the signal length $N$. For each
method we averaged over 1000 series with $\alpha = 0.7$.

well-defined its intrinsic exponents $\alpha$ and $\beta$ become. The power spectrum and
DFA fluctuation function become less smooth as a time series becomes shorter,
increasing the error in calculating (and already defining) the exponents. The
less smooth the curves are, the less accurately the exponents are defined, and
different methods will yield different results. This is essentially not an error in
one or the other method.

Fig. 4. (colour online) (a) Correlation scattering plots of the scaling exponents
calculated with DFA1 and CMA for $N = 50$ (black), $N = 200$ (red) and $N = 5000$
(green). (b) Corresponding standard deviations SD1 (black stars) and SD2 (red
plus signs) as defined in Eqs. (10) versus $N$. Note that SD1 decreases faster with
$N$ than SD2, suggesting that the uncertainty of CMA and DFA1 decreases faster
with $N$ than the indeterminacy of data generation; for comparison see the dotted
line: $SD \sim (\ln N)^{[...]}$. The results of 1000 series with an imposed value of $\alpha = 0.7$
are shown.

In order to get an impression of the uncertainty of the methods and the error in 
generating data sets with a certain scaling exponent, one can study correlation 
scattering plots (Figs. 4 and 5). The standard deviations to characterize such 

Fig. 5. (colour online) Same as Fig. 4 but for the scaling exponents calculated with
DFA1 and MDFA1. Similar plots are obtained for CMA versus MDFA1 (not shown).

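plots are defined by [52]

$SD1 = \left\{ \frac{1}{2N} \sum_{i=1}^{N} \left[ (\alpha^{i}_{\rm DFA} - \alpha^{i}_{y}) - (\bar{\alpha}_{\rm DFA} - \bar{\alpha}_{y}) \right]^2 \right\}^{1/2}$,

$SD2 = \left\{ \frac{1}{2N} \sum_{i=1}^{N} \left[ (\alpha^{i}_{\rm DFA} + \alpha^{i}_{y}) - (\bar{\alpha}_{\rm DFA} + \bar{\alpha}_{y}) \right]^2 \right\}^{1/2}$,    (10)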



so that SD1 (SD2) is the standard deviation perpendicular (parallel) to the
line given by $\alpha_{\rm DFA} = \alpha_y$, where $\alpha_y$ is the scaling exponent calculated by
method $y$. Assuming that $\alpha_y$ is composed of the intrinsic scaling exponent
$\tilde{\alpha}_y$ of method $y$ and an error $\Delta\alpha$ because of data generation, it can be seen
from Eq. (10) that SD1 eliminates $\Delta\alpha$ and thus may give a hint about the
accuracy of method $y$ (see footnote 3). If the considered method calculates exactly the same
scaling exponent as DFA1, SD1 would vanish. On the other hand, if method
$y$ deviates from DFA1, SD1 will become large, i.e. comparable with SD2.
Consequently, the indeterminacy of data generation can be assessed by the
difference between SD2 and SD1.

Figures 4 and 5 show that SD1 is clearly smaller than SD2, suggesting that
the errors from data generation are larger than the deviations of both CMA
and MDFA1 results from DFA1 results. In addition, the decay of SD1 is faster
than $(\ln N)^{[...]}$ while the decay of SD2 is slower than this.
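The two descriptors of the scatter plots can be computed from paired exponent estimates as sketched below. The function name is hypothetical, and the factor 1/2 is our assumption: it follows from taking the standard deviation of the points projected onto the directions perpendicular and parallel to the diagonal.

```python
def sd1_sd2(alpha_ref, alpha_y):
    """Return (SD1, SD2) for paired exponent estimates of two methods."""
    if len(alpha_ref) != len(alpha_y):
        raise ValueError("both lists must have the same length")
    n = len(alpha_ref)

    def variance(values):
        mean = sum(values) / n
        return sum((v - mean) ** 2 for v in values) / n

    diff = [a - b for a, b in zip(alpha_ref, alpha_y)]   # across the diagonal
    summ = [a + b for a, b in zip(alpha_ref, alpha_y)]   # along the diagonal
    sd1 = (variance(diff) / 2.0) ** 0.5
    sd2 = (variance(summ) / 2.0) ** 0.5
    return sd1, sd2
```

An error that is common to both methods, such as one introduced by data generation, cancels in the differences and therefore only enters SD2.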



3.2 Determination of crossovers 



An often observed phenomenon in real world data sets is the occurrence of 
crossovers, i.e., the correlations of the recorded data do not follow the same 
scaling law for all time scales s. Crossovers occur, for example, in the analysis 



Footnote 3: Alternatively, if one does not compare method $y$ to a reference method, such as
DFA1 in this case, an additional error by estimating $\alpha$ should be taken into account.

Fig. 6. (a) Fluctuation functions $F(s)$ of DFA1 (circles), CMA (triangles) and
MDFA1 (squares) versus time scales $s$ for data with $\alpha = 0.8$ for $s < s_\times$ and $\alpha = 0.5$
for $s > s_\times$ (here $s_\times = 200$). The results have been obtained by averaging 200 data
series of length $N = 100000$ for each method. Note that for the sake of clarity $F(s)$
was divided by $s^{[...]}$. (b) Point to point derivative of $F(s)$ shown in (a).



of short-term correlated data with finite decay time. Hence, an exact detec- 
tion of crossovers is essential for finding characteristic time scales in complex 
systems. To compare the performance of DFAl, CMA, and MDFAl regarding 
the detection of crossovers, we applied these methods to artificial time series 
with a well-defined crossover at scale Sx- The results are shown in Fig. 6. 
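
The point to point derivative shown in Fig. 6(b) can be sketched as the local
slope of $\ln F(s)$ versus $\ln s$ between neighbouring scales. The following
Python snippet is a minimal illustration on an idealized fluctuation function
with a sharp crossover; the function `local_slopes` and the test function are
hypothetical and do not claim to reproduce how the crossover positions in
this work were extracted.

```python
import numpy as np

def local_slopes(s, F):
    """Point-to-point derivative d ln F / d ln s between neighbouring scales."""
    ls = np.log(np.asarray(s, dtype=float))
    lf = np.log(np.asarray(F, dtype=float))
    slopes = np.diff(lf) / np.diff(ls)
    s_mid = np.exp(0.5 * (ls[1:] + ls[:-1]))  # geometric mean of neighbouring scales
    return s_mid, slopes

# Idealized F(s): exponent 0.8 below s_x = 200 and 0.5 above, continuous at s_x.
s = np.unique(np.logspace(1, 4, 40).astype(int))
s_x = 200.0
F = np.where(s < s_x, s ** 0.8, s_x ** 0.3 * s ** 0.5)
s_mid, slopes = local_slopes(s, F)
```

The local slope stays at 0.8 on small scales and drops to 0.5 on large scales;
for real fluctuation functions the transition is smooth rather than abrupt.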



It is sufficient to study only one scenario of a crossover in artificial data since
the systematic deviation of the observed crossover from the real crossover was
found to be independent of the values of $\alpha$ for DFA [8]. A convenient way
to generate such time series is by using a modification of the Fourier filtering
method. If we need a crossover at scale $s_\times$ with scaling exponents
$\alpha_1$ for $s < s_\times$ and $\alpha_2$ for $s > s_\times$, the power
spectrum of an uncorrelated random series is multiplied by
$(f/f_\times)^{-\beta_2}$ for low frequencies $f < f_\times = 1/s_\times$ and by
$(f/f_\times)^{-\beta_1}$ for frequencies $f > f_\times$. The relation between
$\alpha$ and $\beta$ is given by $\beta = 2\alpha - 1$ [51]; the inverse Fourier
transform of the manipulated power spectrum yields the desired data.
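
The generation procedure described above can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the code used
for the results in this work: the function names `power_gain` and
`crossover_series`, the use of a real FFT, the assignment of $\beta_2$ to
frequencies below $f_\times$ (large scales) and $\beta_1$ to frequencies above
it, and the final normalization to zero mean and unit standard deviation are
all choices made for the example.

```python
import numpy as np

def power_gain(f, f_x, beta1, beta2):
    """Factor applied to the power spectrum: exponent beta2 below f_x, beta1 above."""
    f = np.asarray(f, dtype=float)
    gain = np.ones_like(f)
    low = (f > 0) & (f < f_x)      # low frequencies <-> large scales s > s_x
    high = f >= f_x                # high frequencies <-> small scales s < s_x
    gain[low] = (f[low] / f_x) ** (-beta2)
    gain[high] = (f[high] / f_x) ** (-beta1)
    return gain

def crossover_series(n, s_x, alpha1, alpha2, seed=None):
    """Series of length n with exponent alpha1 for s < s_x and alpha2 for s > s_x."""
    rng = np.random.default_rng(seed)
    beta1, beta2 = 2 * alpha1 - 1, 2 * alpha2 - 1   # beta = 2*alpha - 1
    spec = np.fft.rfft(rng.standard_normal(n))       # spectrum of uncorrelated noise
    f = np.fft.rfftfreq(n)                           # frequencies in cycles per sample
    # amplitudes scale with the square root of the power-spectrum factor
    spec = spec * np.sqrt(power_gain(f, 1.0 / s_x, beta1, beta2))
    spec[0] = 0.0                                    # remove the zero-frequency component
    x = np.fft.irfft(spec, n)
    return (x - x.mean()) / x.std()

x = crossover_series(2 ** 14, s_x=200, alpha1=0.8, alpha2=0.5, seed=0)
```

Both factors equal one at $f = f_\times$, so the filtered spectrum is continuous
at the crossover frequency.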



Figure 7 shows our results for generated data with systematically varied real
crossover $s_\times$. While DFA1 and CMA slightly overestimate the position of
the crossover ($s'_\times > s_\times$) by the same degree, MDFA detects
$s_\times$ rather accurately. Clearly, a linear relationship between $s_\times$
and $s'_\times$ is observed. Hence, when observing a crossover at position
$s'_\times$ in DFA1, CMA, or MDFA1, the real crossover position $s_\times$ can
be estimated by the equations given in the caption of Fig. 7.
Fig. 7. Real crossovers $s_\times$ versus observed crossovers $s'_\times$ for
DFA1 (circles), CMA (triangles) and MDFA1 (squares). The results have been
obtained by averaging over the same number of configurations as in Fig. 6 for
each $s_\times$. The $s_\times$ values can be estimated from $s'_\times$ by
$\ln s_\times \approx \ln s'_\times - 0.25$ (DFA1),
$\ln s_\times \approx 1.05 \ln s'_\times - 0.47$ (CMA), and
$\ln s_\times \approx 1.04 \ln s'_\times - 0.19$ (MDFA1), respectively.
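
The empirical relations in the caption of Fig. 7 can be applied directly to
correct an observed crossover position. The short Python helper below is a
sketch; the names `FIT` and `real_crossover` are hypothetical, and the
coefficients are simply the fitted values quoted in the caption.

```python
import math

# (slope, intercept) of ln s_x ~ slope * ln s'_x + intercept, from the caption of Fig. 7
FIT = {
    "DFA1": (1.00, -0.25),
    "CMA": (1.05, -0.47),
    "MDFA1": (1.04, -0.19),
}

def real_crossover(s_obs, method):
    """Estimate the real crossover s_x from the observed crossover s'_x."""
    slope, intercept = FIT[method]
    return math.exp(slope * math.log(s_obs) + intercept)

estimates = {m: real_crossover(200.0, m) for m in FIT}
```

For an observed crossover at $s'_\times = 200$, the DFA1 relation gives
$s_\times \approx 156$, while the MDFA1 relation leaves the value almost
unchanged.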

3.3 Data with monotonic trends



Trends are ubiquitous in noisy signals obtained from real systems. As
discussed above and shown in previous work, trends may mask the real
correlation behaviour of the intrinsic fluctuations in the data. To study
the effect of trends in DFA, CMA and MDFA, we have added a linear trend and a
power-law trend with non-integer exponent to the original record $(x_i)$
generated with the Fourier transform method. Other kinds of trends
(polynomial, sinusoidal and irregularly oscillating) have been systematically
studied elsewhere [8,9,42,43,44].

Figure 8 depicts our results after adding a linear trend to long-range correlated
data (with $\langle x \rangle = 0$ and $\sigma(x) = 1$ before adding the trend).
Since DFA1, CMA and MDFA1 are, by definition, not able to remove linear trends
in the original data, all $F(s)$ curves show trend-induced crossovers at
$s'_\times$, which occur slightly earlier for MDFA1. Above the crossover, an
artificial scaling exponent $\alpha_{\rm trend} = 2$ is observed in agreement
with [8]. A systematic variation of the strength of the trend $A$ shows that
the crossover position $s'_\times$ decreases with increasing $A$ as
$s'_\times \sim A^{-\delta}$ with an exponent $\delta \approx 0.71$, independent
of the technique (see inset of Fig. 8(a)) and also independent of the
fluctuation exponent $\alpha$ (not shown). This scaling relation allows
extrapolation to smaller values of $A$. For comparison, the fluctuation
function $F(s)$ is also shown for DFA2. In this case, we clearly observe the
expected scaling behaviour without any crossover.
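
The different response of DFA1 and DFA2 to a linear trend can be reproduced
with a compact sketch of DFA with polynomial detrending. The implementation
below is a simplified variant written for illustration (forward segments
only, least-squares polynomial fit in every segment); the function name `dfa`
and all parameter values are assumptions. To keep the example deterministic it
analyses the pure trend $x_i = A\, i/N$ without any noise.

```python
import numpy as np

def dfa(x, scales, order=1):
    """Return F(s) for each s in `scales` using polynomial detrending of `order`."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())               # integrated (profile) series
    F = []
    for s in scales:
        n_seg = len(profile) // s                   # non-overlapping segments, forward only
        segs = profile[:n_seg * s].reshape(n_seg, s).T   # shape (s, n_seg)
        t = np.linspace(-1.0, 1.0, s)               # rescaled time keeps the fit well conditioned
        V = np.vander(t, order + 1)                 # polynomial basis up to t**order
        coef, *_ = np.linalg.lstsq(V, segs, rcond=None)
        resid = segs - V @ coef                     # detrended profile in every segment
        F.append(np.sqrt(np.mean(resid ** 2)))
    return np.array(F)

N, A = 2 ** 14, 10.0
trend = A * np.arange(1, N + 1) / N                 # pure linear trend, no noise
scales = np.array([16, 32, 64, 128, 256])
F1 = dfa(trend, scales, order=1)                    # DFA1 cannot remove the trend
F2 = dfa(trend, scales, order=2)                    # DFA2 removes it up to rounding
alpha_trend = np.polyfit(np.log(scales), np.log(F1), 1)[0]
```

The profile of a linear trend is quadratic, so the linear fit of DFA1 leaves a
residual that grows as $s^2$, which corresponds to $\alpha_{\rm trend} = 2$,
while the quadratic fit of DFA2 removes it. Adding long-range correlated noise
to `trend` would produce the crossover between the intrinsic exponent and
$\alpha_{\rm trend}$ discussed in the text.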



The application of a trend with a non-integer exponent, e.g.
$x'_i = x_i + A\,(i/N)^{p}$ with non-integer $p$, leads to similar results as
shown in Fig. 8 (not shown, see also [8] for DFA). However, here we also
observe a trend-related crossover for DFA2.
Fig. 8. Fluctuation functions and point to point derivative of long-range correlated
data sets with a linear trend: $x'_i = x_i + A x$ with $x = i/N$ ($\alpha = 0.65$,
$A = 10$). (a) Trend-related crossover after analysis by means of DFA1 (circles,
$s'_\times \approx 187$), CMA (triangles up, $s'_\times \approx 186$) and MDFA1
(squares, $s'_\times \approx 170$). For comparison, $F(s)$ versus $s$ is shown
also for DFA2 (triangles down). Since DFA2 eliminates the linear trend in
$x'_i$ we find the expected scaling behaviour $F(s) \sim s^{0.65}$ without
crossover (the DFA curves were shifted downwards for better visibility). In the
inset the position of the crossover is shown for different strengths $A$ of the
trend for all three methods; the data can be fitted by
$s'_\times \propto A^{-\delta}$ with an exponent $\delta \approx 0.71$,
independent of the technique. (b) Point to point derivative of the $F(s)$
functions shown in (a). The results were obtained by averaging 100 data series
of length $N = 100000$.
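
As a worked illustration of the scaling relation between crossover position
and trend strength, the following Python sketch extrapolates the crossover to
other values of $A$. It assumes that the power law passes through the single
DFA1 point quoted in the caption of Fig. 8 ($A = 10$, $s'_\times \approx 187$)
with $\delta \approx 0.71$; the prefactor of the actual fit is not given in the
text, so the function `crossover_for_trend` and its defaults are hypothetical.

```python
def crossover_for_trend(A, A_ref=10.0, s_ref=187.0, delta=0.71):
    """Extrapolate the trend-induced crossover s'_x to trend strength A.

    Assumes s'_x = s_ref * (A / A_ref) ** (-delta), anchored at one reference point.
    """
    return s_ref * (A / A_ref) ** (-delta)

s_weak = crossover_for_trend(1.0)     # weaker trend: crossover at larger scales
s_strong = crossover_for_trend(100.0) # stronger trend: crossover at smaller scales
```

Under these assumptions a trend ten times weaker ($A = 1$) shifts the crossover
to roughly $s'_\times \approx 960$.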



4 Conclusion 



In summary, we have compared several recently suggested detrending methods
based on random walk theory which were developed to detect long-range
correlations in data affected by trends. In particular, we investigated the
performance of the Centered Moving Average (CMA) method and the Modified
Detrended Fluctuation Analysis (MDFA) regarding their behaviour on small
and large scales, and the determination of crossovers in monofractal data sets
with different lengths and monotonic trends. A systematic comparison of
CMA and MDFA with standard DFA showed a small advantage of CMA
in the computation of the scaling behaviour on small ($s < 10$) and large
($s > N/4$) scales. The detection of crossovers in the data was somewhat more
exact with MDFA. Ultimately, we think that CMA is a good alternative to
DFA1 when analyzing the scaling properties in short data sets without trends.
Nevertheless, for data with possible unknown trends we recommend the
application of standard DFA with several different detrending polynomial
orders to distinguish real crossovers from artificial crossovers due to trends.
In addition, an independent approach (e.g., wavelet analysis) should be used to
confirm findings of long-range correlations.